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Abstract 

Background: Changes in gene regulatory networks drive the evolution of phenotypic diversity both within and 
between species. Rewiring of transcriptional networks is achieved either by changes to transcription factor binding 
sites or by changes to the physical interactions among transcription factor proteins. It has been suggested that the 
evolution of cooperative binding among factors can facilitate the adaptive rewiring of a regulatory network. 

Results: We use a population-genetic model to explore when cooperative binding of transcription factors is favored 
by evolution, and what effects cooperativity then has on the adaptive re-writing of regulatory networks. We consider a 
pair of transcription factors that regulate multiple targets and overlap in the sets of target genes they regulate. We 
show that, under stabilising selection, cooperative binding between the transcription factors is favoured provided the 
amount of overlap between their target genes exceeds a threshold. The value of this threshold depends on several 
population-genetic factors: strength of selection on binding sites, cost of pleiotropy associated with protein-protein 
interactions, rates of mutation and population size. Once it is established, we find that cooperative binding of 
transcription factors significantly accelerates the adaptive rewiring of transcriptional networks under positive 
selection. We compare our qualitative predictions to systematic data on Socchoromyces cerevisioe transcription factors, 
their binding sites, and their protein-protein interactions. 

Conclusions: Our study reveals a rich set of evolutionary dynamics driven by a tradeoff between the beneficial 
effects of cooperative binding at targets shared by a pair of factors, and the detrimental effects of cooperative binding 
for non-shared targets. We find that cooperative regulation will evolve when transcription factors share a sufficient 
proportion of their target genes. These findings help to explain empirical pattens in datasets of transcription factors in 
Saccharomyces cerevisiae and, they suggest that changes to physical interactions between transcription factors can 
play a critical role in the evolution of gene regulatory networks. 



Background 

It is often difficult for a population to acquire an adaptive 
phenotype that requires simultaneous changes in the co- 
expression of multiple genes [1-10]. If selection favours a 
change in the way a group of genes are regulated, then 
each of the target genes must independently gain novel 
binding sites and/or lose existing ones [8,9,11,12]. This 
has led to the proposal that adaptive rewiring of a regu- 
latory network can be accelerated if pairs of transcription 
factors bind their targets cooperatively, through a physical 
interaction between the transcription factor proteins 
themselves [8,9,11]. 
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Given the potential adaptive benefit of cooperative reg- 
ulation, it makes sense to ask, when will cooperative 
binding between a pair of transcription factors be able 
to invade a population that lacks such cooperativity? To 
answer this we must understand the following tradeoff: 
although cooperative binding between a pair of factors 
may result in improved regulation at the target genes 
shared by both factors, any mutation that results in a 
physical interaction between the transcription factors will 
effect all of their targets (Figure 1). Thus the advantageous 
fitness effects of improved binding at some, shared targets 
must outweigh any deleterious effects of misregulation at 
other, non-shared targets in order for cooperative binding 
to be favored by evolution. 

A number of previous studies have explored the mech- 
anistic details of cooperative transcription factor binding 
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Figure 1 Schematic of the population-genetic model. A schematic cartoon of our population-genetic model, (top) When cooperativity is absent 
different transcription factors (gray and red) must bind to sites at each of their targets independently. Each factor has a number of targets, K] and K 2l 
and a number fi(Kl + K2) of shared targets (bottom). When cooperativity is present, a physical interaction between transcription factors (blue line) 
can mitigate the need to bind independently at shared targets, but may cause misregulation at targets that are not shared, by causing the factor 
with which it interacts cooperatievly to misbind. Cooperatively is therefore advantageous between transcription factors that share many targets, but 
it may be deleterious at targets that are not shared. 



at a given target gene [13-15]. Such biophysical studies 
focus on transcription factor binding at a single target 
gene and are able, with remarkable accuracy, to account 
for a number of the physical properties of binding sites 
[14,16,17]. However, the evolution of cooperative binding 
occurs through mutations at transcription factor proteins, 
and such mutations can alter transcription factor binding 
at every binding site across the genome. To understand 
the fitness effects of such a mutation therefore requires 
that we understand the evolution of the whole ensemble 
of binding sites for a transcription factor. The population 
genetics of such an ensemble cannot be understood in 
a simple way just by focusing on the details of a single 
member of the ensemble. They depend critically on the 
population-genetic parameters of the ensemble, such as 
number of target genes, overall mutation rates and selec- 
tion coefficients, and population size. Therefore in this 



paper, we do not focus on the details of a cooperative 
binding at a single target gene. Instead our analysis is in 
terms of these population-genetic parameters, and whilst 
we estimate selective coefficients from biophysical studies, 
we do not specify the mechanistic details of protein-DNA 
interactions that give rise to them. 

We use a mathematical model to study the conditions 
under which cooperative binding between pairs of tran- 
scription factors is favoured. We first determine the evolu- 
tionary conditions that favour cooperative binding under 
stabilising selection, in terms of the basic evolutionary 
parameters of the population: the strength of selection on 
binding sites, the rate of mutation, and the population 
size. We then study the influence of cooperative regula- 
tion on the capacity for a transcriptional circuit to adapt 
under positive selection. We calculate the time required 
for a target gene to gain a new, adaptive transcription 
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factor binding site, in the presence or absence of cooper- 
ative interactions among its regulators. We confirm our 
analytical results on the evolution of cooperative regu- 
lation by comparison to Monte-Carlo simulations of the 
Wright-Fisher process associated with our system, and 
we compare our qualitative conclusions to systematic 
empirical data. 

Our population-genetic model describes a pair of tran- 
scription factors, each with its own set of target genes, 
with some degree of overlap between these sets (Figure 1). 
According to our model, which is specified in detail below, 
a target gene that is regulated by both factors has two 
corresponding binding sites, while a target gene that is 
regulated by only one of the factors has a single bind- 
ing site. We assume that mutations that result in loss 
of function can occur at any binding site, and that non- 
functional binding sites can also undergo gain of func- 
tion mutations. When there is no cooperative regulation 
between the two transcription factors, binding to each of 
their targets is determined solely by their binding sites. 
If a binding site is not functional, this results in reduced 
fitness. When cooperative binding is present, two conflict- 
ing effects occur: On the one hand, cooperative binding 
partially compensates for the deleterious effects of loss of 
function mutations to the binding sites at shared targets. 
On the other hand, cooperative binding results in some 
degree of mis-regulation at each of the targets that are not 
shared, and this has a deleterious impact on fitness. By 
constructing our model in terms of these fitness benefits 
and costs we are able to study the evolutionary dynam- 
ics of the system, and determine the effects of varying 
different population-genetic parameters on the evolution 
of cooperative gene regulation. This approach therefore 
complements the detailed mechanistic models of gene 
regulation that have been studied elsewhere [13-15]. 

Results and discussion 

Stabilising selection without cooperative binding 

We consider a pair of transcription factors, labelled 1 and 
2, that have K\ and I<2 targets, respectively. A fraction 
P of the binding sites are at shared target genes, so that 
the number of binding sites at genes that are co-regulated 
by the pair is fi(K\ + 7<2), as illustrated in Figure 1. Loss 
of function mutations occur at binding sites at a rate u\, 
and back mutations, which result in a functional binding 
site being gained at a target, occur at rate u g . An individ- 
ual incurs a fitness penalty s, where 0 < s < 1, for each 
non-functional binding site, and fitness is assumed to be 
multiplicative across loci. Therefore the fitness of an indi- 
vidual that lacks i < K\ + I<2 of its required binding 
sites is Wi = (1 — s) 1 . The fitness landscape associated 
with our model thus has a single peak at i = 0; and for 
each transcription factor binding site that is lost, fitness is 
reduced by an additional factor (1—5). Empirical estimates 



of the strength of selection on transcription factor binding 
sites suggest that typically Ns ~ 10 [18], suggesting that s 
is small. We assume that s is the same for all binding sites, 
an assumption which we relax in the Methods section. 

We consider a population of N asexual individuals. 
The evolution of the population can be described by 
keeping track of the relative abundances of each "ham- 
ming class" [19-21]. Hamming class i corresponds to 
those individuals who currently lack i transcription fac- 
tor binding sites. We denote the frequency of indi- 
viduals in hamming class i by x\. In an infinitely 
large population, the evolution of hamming class i is then 
described by the differential equations [20,21] 

Ki+K 2 

Xi = V -^ZiPij, (1) 
;=0 

where w = Ylflo^ w i x i> an< ^ Pij ls tne probability 
a genotype lacking ; functional binding sites mutates 
to a genotype lacking i functional binding sites (see 
Methods). Previous work [19-21] has shown that at equi- 
librium, when rates of forward and back mutations are 
identical {u\ — u g ), the solution to Equation 1 is a binomial 
distribution. In the more general case of a finite popula- 
tion, with ui 7^ Ug, we find that the equilibrium continues 
to be well approximated by a binomial distribution, with 
mean (K\ + K2)a s . The term a s is the probability that a 
binding site will be non-functional in a randomly chosen 
individual at equilibrium. The probability a s depends on 
the strength of selection against non-functional binding 
sites, 5, population size, N, and the rates of forward and 
back mutation, u\ and u g (see Methods and [20,21]). 

The equilibrium distribution above describes how sta- 
bilizing selection determines the frequencies of functional 
binding sites in a population. The associated mean fitness 
for a pair of transcription factors that do not bind cooper- 
atively is w = (1 — a s s) Kl+K2 (see Methods), and the mean 
fitness contribution of each binding site is 1 — a s s. We are 
typically concerned with the case in which u\, u g s. In 
this case, when 2Ns > 1, a s can be approximated by 

1 Ul Ul 

a s « — + — 

2Ns ui + u g s 

and otherwise by 



ui + u g 

(see Methods). These equations have an intuitive interpre- 
tation: When 2Ns > 1 the first term describes the effect 
of genetic drift which tends to push the system towards 
its neutral equilibrium, ao = ui/(ui + u g ), and the second 
term describes the effect of selection. In the limit N —> oo, 
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a s equals ui/s, which is the standard result for the fre- 
quency of a deleterious allele in an infinite population 
under mutation-selection balance. When 2Ns < 1, evolu- 
tion is nearly neutral and drift dominates, so the system is 
close to the neutral equilibrium de- 
stabilising selection with cooperative binding 
Here we modify our model to account for cooperative reg- 
ulation by a pair of factors. This allows us to ask when 
cooperative regulation is favored by evolution. A muta- 
tion that results in cooperative binding between a pair 
of transcription factors has two effects on the fitness of 
a transcriptional circuit. For a target that is regulated by 
both transcription factors, we assume that cooperative 
binding mitigates the effects of deleterious mutations at 
transcription factor binding sites [7-9]. This results in a 
reduced fitness penalty for a mutation at the j3(K\ + K2) 
shared targets, so that (1 — s) is replaced by (1 — hs) for 
some constant 0 < h < 1. Nonetheless, there are also 
(1 — p)(K\ + K2) targets that are regulated by only one 
or the other of the transcription factors. We assume that 
the cooperative binding of the transcription factors causes 
pleiotropic mis-regulation at these targets (since the other 
transcription factor, which does not have a binding site 
at such sites, now binds to the first transcription factor 
through a physical interaction). This results in a fitness 
penalty t at each of the (1 — fi)(K\ +K2) targets that are not 
co-regulated. Fitness is again assumed to be multiplicative, 
so that the cost of pleiotropy associated with cooperative 
binding is (1 - f)(i-/*)(*i+*2). 

Provided ui, u g 1> genes that are co-regulated and 
genes that are not co-regulated have equilibrium distri- 
butions described by independent binomial distributions 
with means a^s and a s respectively, which are approxi- 
mated by Equation 2 (substituting hs for s appropriately, 
see Methods). We can now specify the conditions for 
the invasion of cooperative gene regulation. A mutation 
resulting in cooperative binding between a pair of fac- 
tors will be favoured if the expected fitness of the mutant 
is greater than the equilibrium mean fitness. Using the 
expressions for mean fitness given above, this occurs when 
(1 - a s5 )/K*i+*2) < (i _ 0 (1-WM 2)(1 _ ashs) P{K l+ K 2 ) t 

Assuming t,s 1, this expression can be simplified to 
give ft > t+sa^i-k) ' Th* s means that, when the frac- 
tion of binding sites at shared targets, /3, is greater than 
a threshold depending on s, /z, t and a s , a mutation that 
results in cooperative binding can invade a population at 
equilibrium. 

Similarly, a mutation that results in the loss of cooper- 
ative binding in a population where it is present will be 
favoured when (l-a hs s)^ Kl + I<2) < ( 1 - 0 (1 +/<2> ( 1 - 
ap ls hs)^ I<1 ~ l ' K2 \ Again assuming t,s 1> this expression 
can be simplified to give ft < t+sa *(i_fy so that, when 
the fraction of binding sites at snared targets, ft, is less 



than a threshold depending on s, /z, t and a^, a mutation 
that results in loss of cooperative binding can invade a 
population at equilibrium. 

Since the first expression in Equation 2 is monotonically 
decreasing in s, and the second expression is indepen- 
dent of 5, it is always true that a^ s < a s , i.e populations 
that have cooperative binding accumulate more deleteri- 
ous mutations, that result in weaker transcription factor 
binding sites, than populations that lack it. As a result 
there is a range of /3 for which both a population that lacks 
cooperative binding, and a population that has coopera- 
tive binding are not invadable by mutations that gain or 
remove cooperative binding respectively. In this range, the 
evolutionary dynamics of the system are bi-stable. In this 
range, we expect to find some genes that are regulated by 
pairs of transcription factors that act cooperatively and 
some that dont. 

Using the expression for a s given in Equation 2, and 
recalling that ao = ui/(ui+u g ) is the neutral equilibrium in 
a system dominated by drift, the threshold value of 
above which selection favours a mutation causing coop- 
erative binding in a population that lacks it, is given by 



P 



2Nt 
2Nt+aoO—h) 



if 2NS > 1 



otherwise 



(3) 



t-\-sao(l— h) 

Similarly, the threshold value of below which selection 
favours a mutation resulting in loss of cooperative binding 
in a population that has it, is given by 

2Nth 

(4) 

otherwise 



P 



2Nth+a 0 (l-h) 



t+sao(l—h) 

These equations allow us to make a number of observa- 
tions about the evolution of cooperative gene regulation 
(Figure 2, and see Methods). Beginning with Equation 3 
for a population lacking cooperative binding, we see that 
when N and/or s is large, so that 2Ns > 1, the threshold 
number of shared targets /3 above which cooperative bind- 
ing becomes advantageous is independent of the strength 
of selection s (Figure 2a). However the threshold decreases 
as the mutation-buffering effect of cooperative binding 
increases (i.e. as h decreases, Figure 2b). As population 
size N increases, selection becomes more efficient and 
the threshold value of /3 increases (Figure 2c). Finally, 
the threshold also increases with the cost of pleiotropy 
t (Figure 2d). In contrast, when N and/or s is small, so 
that 2Ns < 1, drift dominates and the threshold num- 
ber of shared targets /3 is independent of population size 
N (Figure 2c). However the threshold decreases with the 
strength of selection s (Figure 2a), because when drift 
dominates the number of deleterious mutations is at the 
neutral equilibrium, and increasing s increases the impact 
of each mutation on overall fitness. 
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Cost of pleiotropy, t Population size, N 

Figure 2 Evolutionary parameters that permit cooperative regulation. Evolutionary parameters that permit the evolution of gene regulation 
by cooperative transcription factors. Threshold number of shared targets for gain (black) and loss (red) of cooperative binding to be advantageous 
in a population at equilibrium under stabilising selection. The black line shows the value of ft above which a new mutation that results in 
cooperative binding will invade in a population that lacks cooperative binding. The red line shows the value of p below which a mutation resulting 
in loss of cooperative binding will invade, in a population that has cooperative binding. For values of p that lie in the gray region, the dynamics are 
bistable: a population with cooperative binding will preserve it, and one without binding will not gain binding. The threshold fraction of shared 
targets varies with (top left) strength of selection, s, (top right) strength of cooperativity in reducing the effects of deleterious mutations 1 /h, 
(bottom left) the cost of pleiotropy t and (bottom right) the population size, N. Lines show our analytic equations (Equations 2 and 3), and points 
show the results of 1 0 5 replicate Monte-Carlo simulations. Parameter values (unless stated otherwise) are u\ = 2 x 1 0 -7 , u g = 1 0 -7 , Ki + K 2 = 1 00, 
s = ]0~ 3 ,h= ]0~\t= 10- 4 and/V= 10 4 . 



Similarly, from Equation 4 for a population with coop- 
erative binding, we see that when N and/or hs is large, 
so that 2Nhs > 1, the threshold number of shared 
targets below which cooperative binding becomes dis- 
advantageous is independent of the strength of selec- 
tion s (Figure 2a). As before, the threshold decreases 
as the mutation buffering effect of cooperative binding 
increases (i.e. as h decreases, Figure 2b) and the thresh- 
old increases with population size N (Figure 2c), and 
the cost of pleiotropy t (Figure 2d). In contrast, when N 
and/or hs is small, so that 2Nhs < 1, drift dominates 
and the threshold number of shared targets /3 is inde- 
pendent of population size N (Figure 2c), but decreases 
with the strength of selection 5 (Figure 2a). The size 
of the bistable region is largest when s is large and h 
is small, and for intermediate values of N and t, as 
shown in Figure 2. As this analysis demonstrates, there 
is a broad range of possible evolutionary outcomes and, 
crucially, cooperative binding can evolve under a wide 
range of circumstances despite the deleterious pleiotropic 
effects associated with physical interactions among 
transcription factors. 



Adaptation of transcriptional circuits under positive selection 

When cooperative binding is present, under stabilis- 
ing selection, transcription factor binding sites at co- 
regulated genes are better able to tolerate mutations 
(i.e ah s > a s ). Under positive selection for a novel expres- 
sion phenotype, this may speed adaptation, since greater 
mutational robustness generates greater genetic diver- 
sity and can help speed adaptation (Figure 3a) [22]. This 
may occur, for example, when adaptation involves change 
in the transcription factor that regulates a target gene 
[7-9,11], through turnover of transcription factor bind- 
ing sites [23-25]. We use our model to quantify the extent 
to which cooperative binding among transcription fac- 
tors accelerates the adaptive rewiring of transcriptional 
circuits under positive selection. 

We study adaptive change that involves replacement of 
an existing transcription factor by a new one that con- 
fers higher fitness. We assume that the target gene must 
first suffer an initially deleterious mutation at its exist- 
ing binding site before a newly adaptive binding site can 
be acquired (Figure 3) [8,9,11]. The newly adaptive bind- 
ing site is produced from binding sites that have already 
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Rewiring with Rewiring without 

Cooperativity Cooperativity 




Figure 3 A schematic cartoon of rewiring. A schematic cartoon of rewiring with (left) and without (right) cooperative binding. Selection favours a 
change in the regulation of target genes from the red TF to the green TF. Rewiring requires an initially deleterious mutation at the red binding site 
before a green binding site can be acquired. The fitness of the different states is shown on the left hand side for each case. The reduced fitness of 
the intermediate state is less when cooperative binding is present than when it is absent. 



mutated at a rate u r . The expected waiting time for such 
a gene to produce a newly adaptive binding site therefore 
depends on the number of binding sites in the population 
that harbor a deleterious mutation, which is proportional 
to a s when cooperativity is absent and a^ s when it is 
present. Since a^ s > a Si this number is greater when 
cooperative binding is present than when it is absent. 

The ratio of waiting times before a newly adaptive bind- 
ing site arises, t*/t r (for populations without, t*, or with, 
£ r , cooperative binding), quantifies the degree to which 
cooperative binding of transcription factors accelerates 
adaptation under positive selection. This ratio is given 
by ah s /a s (Figure 4, see Methods). As Figure 4 shows, 
provided Ns > 1 (i.e. provided deleterious mutations at 
binding sites are not nearly neutral), rewiring of transcrip- 
tional circuits is significantly accelerated by cooperative 
binding among transcription factors. Thus, a population 
that has cooperative binding among transcription fac- 
tors under stabilizing selection, can also experience an 
accelerated rate of adaptation. 

Cooperative binding and the fraction of shared targets in 
yeast 

Our model predicts that, under stabilising selection, coop- 
erative binding will be favoured when the fraction of 
targets shared by a pair of transcription factors exceeds 
a certain threshold. In order to test this prediction, and 
to get some idea of the degree of overlap that is required 
for cooperative binding to arise in natural systems, we 
inspected pairs of transcription factors in Saccharomyces 
cerevisiae. A total of 186 pairs are reported as participat- 
ing in cooperative binding [26], based on a combination 
of ChlP-chip data, transcription factor knockout data, 
and direct experimental evidence. Using the set of genes 



regulated through a transcription factor binding site for a 
total of 204 yeast transcription factors [27,28], we deter- 
mined the fraction of overlapping targets, /3, for all pairs 
of transcription factors (Figure 5). It is important to note 
that, typically, studies that systematically look for coop- 
erative gene interactions take into account the number 
of targets shared by a gene pair. Therefore, to minimise 
the risk of circularity in our analysis, we have used sepa- 
rate datasets to determine cooperative gene interactions, 
and to determine regulatory targets. The mean fraction 
of overlapping targets for genes identified as participating 
in cooperative binding was 10-fold greater (0.21) than the 
mean fraction of overlapping targets at genes that do not 
bind cooperatively (0.02) which is highly statistically sig- 
nificant (p < 2 x 10 -16 , Wilcoxon test). This supports the 
prediction of our population-genetic analysis, and it sug- 
gests that a sizeable overlap in targets is required before 
cooperative binding becomes advantageous. 

Cooperative binding in the yeast sex determination 
network 

The ability of cooperative transcription factors to facili- 
tate adaptation also has empirical support, from obser- 
vations in the sex determination networks of different 
yeast species [7-9]. The acquisition of a protein-protein 
interaction between the mating factor MATo?2 and Mcml 
was able to buffer the deleterious effects of mutations 
that strengthened Mcml binding sites [7]. Prior to the 
emergence of a protein-protein interaction, sex determin- 
ing genes were activated only in the presence of Mcml 
and MATa2 together [7]. The buffering effects of the 
protein-protein interaction allowed Mcml binding sites 
to acquire strengthening mutations such that sex deter- 
mining genes became activated by Mcml alone. As a 
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Selection strength, s 

Figure 4 Cooperative binding accelerates adaptation. Cooperative binding accelerates adaptation under positive selection. The ratio of waiting 
times before the arrival of novel adaptive binding sites for populations without (t*) and with (t r ) cooperative binding. Provided Ns > 1 , cooperative 
binding reduces the adaptation time up to 1 0-fold, compared to populations that lack cooperative binding. The line shows our analytic expression, 
and points show the result of 1 0 5 replicate Monte-Carlo simulations. Parameter values u\ = 2 x 1 0 -7 , u g = 1 0 -7 , K\ + K2 = 1 00, h = 1 0 _1 , 
r= 1(T 4 ,/V = 10% r = 1(T 7 . 



result, MATa2 became redundant and was lost [7]. The 
result was a significant upstream reorganization of the 
yeast sex determination network without the need for 
any parallel changes to the downstream output of the 
network. Similar patterns, in which acquisition of coop- 
erative binding between transcription factors is followed 
by changes to the regulation of their shared targets, are 
observed across the yeast transcriptome [8], and support 
the prediction of our analysis of positive selection on 
transcriptional networks. 



Conclusions 

We have shown that cooperative binding between a 
pair of transcription factors is favoured under stabilis- 
ing selection, provided the overlap between their tar- 
gets is sufficiently large. The threshold fraction of shared 
targets depends upon the strength of selection on bind- 
ing sites, the cost of pleiotropy associated with protein- 
protein interactions, and the rates of mutations. It also 
depends on the population size. Just as in models that 
consider the evolution of redundancy [20,29], we find that 
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Figure 5 Number of shared targets. Fraction of targets that are shared between pairs transcription factors in S. cerevisioe [26-28]. (left) The fraction 
of targets that are shared among paris of transcription factors that lack cooperative binding and (right) the fraction of targets that are shared among 
transcription factors that bind cooperatively. The fraction of targets that are shared is larger among cooperative factors (p < 2 x 1 0 -16 , Wilcoxon test). 
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greater redundancy (i.e. cooperative regulation) is more 
strongly favoured in smaller populations; and that for 
intermediate population sizes the evolutionary dynam- 
ics are bistable, such that cooperative binding is main- 
tained if it is already present, but cannot evolve if it is 
absent. Finally, we found that cooperative binding facili- 
tates the rewiring of transcriptional circuits under positive 
selection. 

This study shows that, even when the deleterious 
effects of pleiotropy are taken into account, mutations 
that change transcription factor function can play an 
important role in the evolution of gene expression. 
Taking account of mutations both to regulatory bind- 
ing sites and to the transcription factors themselves 
reveals a rich set of evolutionary dynamics that helps 
explain how complex transcriptional networks can rapidly 
rewire large sets of genes in order to adapt to new 
environments. 



Methods 

Equilibrium distribution 

To find the equilibrium relative abundances of the ham- 
ming classes X{ that give the solution to Equation 1, 
we follow [20,21] and look for a solution of the form 

Xi = (^ Kl | I<2 ^ 4(1 - a s ) K ^-\ Given this assumed 

form, the mean fitness of the population at equilibrium 
is to = 2^(1 — s) l Xi. Since J2i x i = 1 it is easy to 

show that J2i (^ Kl Y <2 ^) pi = (1 + p)n * Takin 3 p = 
a(l — s)/(l — a), this gives a mean fitness of a> = 
(1 - a s s) {Kl+K2 \ which is the form given in the main 
text. 

To compute a s we follow [20] and write down the gen- 
erating function Tly(z) of a random variable V defined 
by the Hamming class after mutation of an individual 
chosen from the population according to its relative fit- 
ness, where z is a formal variable. The function Ylyiz) 
may be thought of as a probability generating func- 
tion, where the probability distribution associated with 
it gives the distribution of Hamming classes in the 
population at equilibrium [20]. In the case of non- 
identical forward and backward mutation rates, u\ and 
Ug, this is given by flyiz) = ^iW{Xi(ui + (1 — 
ui)z)\(l - u g ) + UgZ) Kl+K2 ~ l . Following [20], we analyse 
the eigensystem problem associated with the popula- 
tion dynamics to determine the equilibrium distribution 
of Hamming classes (i.e the distribution of genotypes 
in the population under mutation selection balance). 
In equilibrium we have flyiz) = °k^ i X{z\ where X is 
the eigenvalue associated with the system. Using our 
assumed form of X{ results in the infinite population 
equilibrium distribution: 



a s = — 
2 



Ul + Ug 

1+ - Ua 



-J1 + 



Ul + 



Ug \ 2 4iui 



(5) 



When cooperative binding is present a subset f3(K\ + 
K2) = Khs of the target genes have selective coefficient 
hs and the remaining (1 — p)(K\ + K2) = K s have selec- 
tive coefficient s. The hamming class of an individual now 
has two indices i and j such that Xy refers to an individ- 
ual with i mutations at shared targets and j mutations at 
unshared targets. In this case we look for solutions of the 

The generating function of V is now given by 
n V (z hs ,Z s ) = ^2^2 W V X iMl + (! ~ u l) z hsY 



form 



x ((1 - Ug) + UgZ h s) Khs ~\ui + (1 - ui)z s y 

X ((1 - Ug) + UgZ s ) KH 

and at equilibrium f\y{zh si z s ) = X J2i Hj x ij z \s^ 
Because we are assuming that wy is just the product of the 
two independent fitness landscapes associated with the 
different selective coefficients, i.e Wy = (1 — hs) l (l — sy 
using our assumed form of Xij results in values of a s and 
a^s as given by Equation 5 for the independent distribu- 
tions with the appropriate selective coefficients. 

The finite N approximation of Equation 5, can be 
obtained from the moment equations of Woodcock and 
Higgs [21], assuming ui, u g ,s,N~ l <^ 1. This gives 



1 + 



1 + 2(ui + Ug) 
2Ns 



-J1 + - 



l + 2(ui + u g )y 4>m l+2(ui + Ug) 



2Ns 



Ul + Ug 



2Ns 



(6) 



Assuming u\, u g 5, we obtain the Taylor expansion 
of a s to first order, in terms of 1/ (2Ns) (which is relevant 
when 2Ns > 1) and in terms of 2Ns (relevant when 2Ns < 
1) to obtain Equation 2. 

Using the above distributions, the equilibrium mean fit- 
ness i n the absence of cooperative binding is W[ n d = 
(l-a s s) Kl+K2 ,zn& in the presence of cooperative binding, 
w coop isw coop = (l-t^-^+^) a - asS) a-mK 1+ K 2)(1 _ 

a hs hs) mi+I<2) which can be Taylor expanded to first order 
and, combined with Equation 2, give Equations 3 and 4. 
We can also find the conditions for the equilibrium mean 
fitness of a population with cooeprative binding to be 
greater than that for a population that lacks it, i.e w coop > 
w ind- When « 1 this can be expressed as 
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2Nt 



2Nt+aoO—2Nhs) 
1 otherwise 



if 2Afe > 1 and2M*s < 1 



(7) 



According to this inequality, cooperative binding is 
advantageous only when the fraction of targets shared by 
the pair of transcription factors is greater than a threshold. 
Since by definition ft < 1, Equation 7 says that cooper- 
ative binding can only increase population mean fitness 
when 2Nhs < 1 and 2Ns > 1, i.e if the benefit of cooper- 
ativity, h, is sufficient to make mutations at transcription 
factor binding sites that are deleterious in the absence of 
cooperativity nearly neutral when it is present. 

Rewiring time 

For a given binding site the waiting time T for the arrival 
of the first adaptively rewired mutant to arise is given by 
the distribution 



where t is time, f is the rate at which the rewiring muta- 
tions occur and Y is fraction of the population at equi- 
librium that is able to undergo rewiring mutations. We 
assume rewiring mutations can occur only following an 
initially deleterious mutation. In the absence of coopera- 
tivity, the fraction of the population with a mutation at a 
given site is a s , therefore Y oc Na s , and we are able to write 



F{T > t} = E[e- UrNa * t ], 



where u r gives the rate at which rewiring mutations occur 
at sites that have already undergone an initially deleterious 
mutation. The excepted waiting time for a single gene is 
thus 

u r Na s 

If the gene to be rewired is coregulated by a pair of 
transcription factors that bind cooperatively, we similarly 
have 

Ths = 



u r Nafo s 

and the ratio of waiting times T s /Tk s is therefore simply 
ah s /a s . Finally, if k genes must be rewired before adap- 
tation occurs, the waiting time for the first event is T/k 
(where T is the waiting time for 1 gene to rewire), the 
waiting time for the second event is T/(k — 1) and so on. 
Therefore the total expected waiting time, T s (k) is 

1 * 1 
u r Na s ~^ i 

Therefore the ratio of expected waiting times with and 
without cooperative binding is independent of the num- 
ber of genes to be rewired, and depends only on the ratio 



Variation in selection strength across sites 

Up to this point we have assumed that the selective coeffi- 
cients, s and h, are constant across binding sites. However 
it is obviously possible that these parameters may vary 
between binding sites. Such a generalization of our model 
represents a significant complication, and a full treatment 
is outside the scope of this paper. However, it can be ana- 
lyzed in the simple case that the coefficients associated 
with each binding site i satisfy 5/ 1, such that the fitness 
landscape is approximately additive. 

We assume that there are a finite set of selective coeffi- 
cients, s a and h y , where the super-scripts a and y index 
the different sets, and that the number of binding sites 
with a given coefficient s a or h Y are distributed according 
to some function F(s) and G(h). We also assume that the 
coefficients s and h are distributed independently of one 
another. Each binding site i has a value s; associated with 
it, drawn according to F(s). In the quasi-species regime 
the probability that binding site i has a mutation is sim- 
ply given by a Si , as given by Equation 5. In this case the 
distribution of hamming classes, rather than being bino- 
mial, is poisson binomial, paramaterized by a Si . Similarly, 
when cooperative binding is present, the distribution is 
poisson binomial with the modification that shared tar- 
gets have a mutation with probability a Si h. where hi is 
drawn independently from the distribution G(h). The sys- 
tem is easiest to analyse if we separate binding sites into 
sub-classes, a, of binding sites with the same selective 
coefficients, where the size of each sub-class, n a is given 
by n a = F(s a )(Ki +K 2 ). The number of mutations in each 
subclass is then given by a binomial distribution. 

When the fitness landscape is close to additive, the 
method of [19] can be applied independently to each sub- 
class to determine the expected number of mutations in 
the sub-class. This is only true in an additive fitness land- 
scape, or in a multiplicative fitness landscape in which 
cross terms between subclasses are sufficiently small that 
they can be neglected. When this condition holds, the 
value of a s a associated with each sub-class is given by 
Equation 6. 

The expected number of mutations in each subclass is 
simply a s an s a and the expected fitness of each sub-class 
is (1 — a s as a ) na . The expected mean fitness of the pop- 
ulation is then cb = Y\ a (l — a s as a ) na . Using our almost 
additive assumption, this can be approximated by cb ~ 
1 - (Ki + K 2 ) £ a F(s°> 5 <*s°\ This is the fitness of the 
population when cooperative binding is absent. Similarly, 
when cooperative binding is present we have, 

wcoop « 1 - (1 - P)(Ki + K 2 )t - (1 - P)(Ki + K 2 ) 

x J2 p ( s ^ a s asa + P(Ki + K 2 ) 

a 

y.Y^J^F(s a )G(h y )a h y ; f,h Y s a 

a y 



Stewart et al. BMC Evolutionary Biology 201 2, 1 2:1 73 
http://www.biomedcentral.eom/1 471 -21 48/1 2/1 73 



Page 10 of 10 



From these the invasion probabilities and threshold values 
of /3 can be calculated in the same way as in the case of 
constant 5 and h above. The only difference in the case 
with variable selective coefficients is that the invisibility 
criteria for a mutation resulting in gain or loss of cooper- 
ative biding dependants on the average of a s s across the 
distribution F(s), and on the average a^hs across the joint 
distribution F(s)G(h). Investigating different forms of the 
functions F(s) and G(h) represents an interesting avenue 
for further work. 
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